Perception of Personality and Naturalness through Dialogues by 
Native Speakers of American English and Arabic 



Maxim Makatchev 

Robotics Institute 
Carnegie Mellon University 
Pittsburgh, PA, USA 

mmakatch@cs.cmu.edu



Reid Simmons 

Robotics Institute 
Carnegie Mellon University 
Pittsburgh, PA, USA 

reids@cs.cmu.edu



Abstract 

Linguistic markers of personality traits have 
been studied extensively, but few cross- 
cultural studies exist. In this paper, we eval- 
uate how native speakers of American English 
and Arabic perceive personality traits and nat- 
uralness of English utterances that vary along 
the dimensions of verbosity, hedging, lexical 
and syntactic alignment, and formality. The 
utterances are the turns within dialogue frag- 
ments that are presented as text transcripts to 
the workers of Amazon's Mechanical Turk. 
The results of the study suggest that all four di- 
mensions can be used as linguistic markers of 
all personality traits by both language commu- 
nities. A further comparative analysis shows 
cross-cultural differences for some combina- 
tions of measures of personality traits and nat- 
uralness, the dimensions of linguistic variabil- 
ity and dialogue acts. 

1 Introduction 

English has been used as a lingua franca across the 
world, but the usage differs. The variabilities in En- 
glish introduced by dialects, cultures, and non-native 
speakers result in different syntax and words ex- 
pressing similar meanings and in different meanings 
attributed to similar expressions. These differences
are a source of pragmatic failures (Thomas, 1983):
situations when listeners perceive meanings and
affective attitudes unintended by speakers. For example,
Thomas (1984) reports that usage of Illocutionary
Force Indicating Devices (IFIDs, such as "I warn
you" (Searle, 1969)) in English by native speakers
of Russian causes the speakers to sometimes
appear "inappropriately domineering in interactions
with English-speaking equals." Dialogue systems,
just like humans, may misattribute attitudes and
misinterpret the intent of users' utterances. Conversely,
they may also cause misattributions and
misinterpretations on the user's part. Hence, taking into
account the user's dialect, culture, or native language
may help reduce pragmatic failures.

This kind of adaptation requires a mapping from
utterances, or more generally, their linguistic features,
to meanings and affective attributions for each
of the target language communities. In this paper
we present an exploratory study that evaluates such
a mapping from the linguistic features of verbosity,
hedging, alignment, and formality (as defined in
Section 3.1) to the perceived personality traits and
naturalness across the populations of native speakers
of American English and Arabic.

Estimating the relationship between linguistic
features and their perception across language
communities faces a number of methodological
difficulties. First, language communities must be
delineated in a way that affords generalizing within
their populations. Defining language communities is
a hard problem, even if it is based on the "mother
tongue" (McPherson et al., 2000). Next, linguistic
features that are potentially important for the
adaptation must be selected. These are, for example,
the linguistic devices that contribute to realization of
rich points (Agar, 1994), i.e. the behaviors that
signal differences between language communities. To
be useful for dialogue system research, the selected
linguistic features should be feasible to implement in
natural language generation and interpretation
modules. Then, a corpus of stimuli that span the
variability of the linguistic features must be created.
The stimuli should reflect the context where the
dialogue system is intended to be used. For example,
in the case of an information-giving dialogue system,
the stimuli should include some question-answer
adjacency pairs (Schegloff and Sacks, 1973). Finally,
scales should be chosen to allow for scoring of the
stimuli with respect to the metrics of interest. These
scales should be robust enough to be applied within
each of the language communities.

In the remainder of this paper, we describe each of
these steps in the context of an exploratory study that
evaluates perception of English utterances by native
speakers of American English and Arabic. Our
application is an information-giving dialogue system
that is used by the robot receptionists
(roboceptionists) in Qatar and the United States
(Makatchev et al., 2009; Makatchev et al., 2010). In
the next section, we continue with an overview of the
related work. Section 3 introduces the experiment,
including the selection of stimuli, measures, and
design, and describes the recruitment of participants
via Amazon's Mechanical Turk (MTurk). We discuss
results in Section 4 and provide a conclusion in
Section 5.

2 Related work

2.1 Cross-cultural variability in English

Language is tightly connected with culture (Agar,
1994). As a result, even native speakers of a language
use it differently across dialects (e.g. African
American Vernacular English and Standard American
English), genders (see, for example, (Lakoff, 1973))
and social statuses (e.g. (Huspek, 1989)), among
other dimensions.

Speakers of English as a second language display
variabilities in language use that are consistent with
their native languages and backgrounds. For example,
Nelson et al. (1996) report that Syrian speakers
of Arabic tend to use different compliment response
strategies as compared with Americans. Aguilar
(1998) reviews types of pragmatic failures that are
influenced by native language and culture. In
particular, he cites Davies (1987) on a pragmatic
failure due to non-equivalence of formulas: native
speakers of Moroccan Arabic use a spoken formulaic
expression to wish a sick person quick recovery,
whereas in English the formula "get well soon" is not
generally used in speech. Feghali (1997) reviews
features of Arabic communicative style, including
indirectness (concealment of wants, needs or goals
(Gudykunst and Ting-Toomey, 1988)), elaborateness
(rich and expressive language use, e.g. involving
rhetorical patterns of exaggeration and assertion
(Patai, 1983)) and affectiveness (i.e. "intuitive-affective
style of emotional appeal" (Glenn et al., 1977),
related to the patterns of organization and
presentation of arguments).

In this paper, we are concerned with English usage
by native speakers of American English and native
speakers of Arabic. We have used the features
of the Arabic communicative style outlined above
as a guide in selecting the dimensions of linguistic
variability that are presented in Section 3.1.

2.2 Measuring pragmatic variation

Perception of pragmatic variation of spoken language
and text has been shown to vary across
cultures along the dimensions of personality
(e.g. (Scherer, 1972)), emotion (e.g. (Burkhardt et
al., 2006)), and deception (e.g. (Bond et al., 1990)),
among others. Within a culture, personality traits
such as extraversion have been shown to have
consistent markers in language (see overview in
(Mairesse et al., 2007)). For example, Furnham
(1990) notes that in conversation, extraverts are less
formal and use more verbs, adverbs and pronouns.
However, the authors are not aware of any
quantitative studies that compare linguistic markers
of personality across cultures. The present study aims
to help fill this gap.

A mapping between linguistic dimensions and
personality has been evaluated by grading essays
and conversation extracts (Mairesse et al., 2007),
and by grading utterances generated automatically
with a random setting of linguistic parameters
(Mairesse and Walker, 2008). In the exploratory
study presented in this paper, we ask our participants
to grade dialogue fragments that were manually
created to vary along each of the four linguistic
dimensions (see Section 3.1).



3 Experiment 

In the review of related work, we presented some
evidence supporting the claim that linguistic markers
of personality may differ across cultures. In this
section, we describe a study that evaluates perception
of personality traits and naturalness of utterances by
native speakers of American English and Arabic.

3.1 Stimuli 

The selection of stimuli attempts to satisfy three
objectives. First, the stimuli should fit our application:
our dialogue system is intended to be used on a robot
receptionist. Hence, the stimuli are snippets of
dialogue that include four dialogue acts that are
typical in this kind of embodied information-giving
dialogue (Makatchev et al., 2009): a greeting, a
question-answer pair, a disagreement (with the user's
guess of an answer), and an apology (for the robot not
knowing the answer to the question).

Second, we would like to vary our stimuli along
the linguistic dimensions that are potentially strong
indicators of personality traits. Extraverts, for
example, are reported to be more verbose (use more
words per utterance and more dialogue turns to
achieve the same communicative goal), less formal
(Furnham, 1990) (in choice of address terms, for
example), and less likely to hedge (use expressions
such as "perhaps" and "maybe") (Nass et al., 1995).
Lexical and syntactic alignment, namely, the tendency
of a speaker to use the same lexical and syntactic
choices as their interlocutor, is considered, at least
in part, to reflect the speaker's co-operation and
willingness to adopt the interlocutor's perspective
(Haywood et al., 2003). There is some evidence that
the degree of alignment is associated with personality
traits of the speakers (Gill et al., 2004).

Third, we would like to select linguistic dimensions
that potentially expose cross-cultural differences
in perception of personality and naturalness.
In particular, we are interested in the linguistic
devices that help realize rich points (the behaviors that
signal differences) between the native speakers of
American English and Arabic. We choose to realize
indirectness and elaborateness, characteristic of
Arabic spoken language (Feghali, 1997), by varying
the dimensions of verbosity and hedging. High
power distance, or influence of relative social status
on the language (Feghali, 1997), can be realized by
the degrees of formality and alignment.

In summary, the stimuli are dialogue fragments 
where utterances of one of the interlocutors vary 
across (1) dialogue acts: a greeting, question-answer 
pair, disagreement, apology, and (2) four linguistic 
dimensions: verbosity, hedging, alignment, and for- 
mality. Each of the linguistic dimensions is parame- 
terized by 3 values of valence: negative, neutral and 
positive. Within each of the four dialogue acts, stim- 
uli corresponding to the neutral valences are repre- 
sented by the same dialogue across all four linguistic 
dimensions. The four linguistic dimensions are real- 
ized as follows: 

• Verbosity is realized as the number of words within
each turn of the dialogue. In the case of the
greeting, positive verbosity is realized by an
increased number of dialogue turns.¹

• Positive valence of hedging implies more ten- 
tative words ("maybe," "perhaps," etc.) or ex- 
pressions of uncertainty ("I think," "if I am 
not mistaken"). Conversely, negative valence 
of hedging is realized via words "sure," "defi- 
nitely," etc. 

• Positive valence of alignment corresponds to 
preference towards the lexical and syntactic 
choices of the interlocutor. Conversely, neg- 
ative alignment implies less overlap in lexical 
and syntactic choices between the interlocu- 
tors. 

• Our model of formality deploys the following
linguistic devices: in-group identity markers
that target positive face (Brown and Levinson,
1987), such as address forms, jargon and
slang, and deference markers that target negative
face, such as "kindly", terms of address, and
hedges. These devices are used in Arabic
politeness phenomena (Farahat, 2009), and there
is evidence of their pragmatic transfer from
Arabic to English (e.g. (Bardovi-Harlig et al.,
2007) and (Ghawi, 1993)).



¹ The multi-stage greeting dialogue was developed via
ethnographic studies conducted at Alelo by Dr. Suzanne
Wertheim. Used with permission from Alelo, Inc.

The complete set of stimuli is shown in Tables 2-6.
Each dialogue fragment is presented as text on
an individual web page. On each page, the participant
is asked to imagine that he or she is one of the
interlocutors and the other interlocutor is described
as "a female receptionist in her early 20s and of
the same ethnic background" as that of the participant.
The description of the occupation, age, gender
and ethnicity of the interlocutor whose utterances
the participant is asked to evaluate should provide
minimal context and help avoid variability due to the
implicit assumptions that subjects may make.
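
As an illustration of how the stimulus set is organized, the crossed structure described above can be enumerated in a few lines of Python. This is a minimal sketch rather than code from the study: the names `stimulus_key`, `conditions`, and `unique_stimuli` are hypothetical, and the only assumption taken from the text is that neutral stimuli are shared across all four linguistic dimensions within a dialogue act.

```python
from itertools import product

DIALOGUE_ACTS = ["greeting", "question-answer", "disagreement", "apology"]
DIMENSIONS = ["verbosity", "hedging", "alignment", "formality"]
VALENCES = ["negative", "neutral", "positive"]


def stimulus_key(act, dimension, valence):
    """Return an identifier for the dialogue text used in one condition.

    Neutral stimuli are shared across dimensions within a dialogue act,
    so the dimension is dropped from the key when the valence is neutral.
    """
    if valence == "neutral":
        return (act, "neutral")
    return (act, dimension, valence)


# Every (act, dimension, valence) condition in the crossed design.
conditions = list(product(DIALOGUE_ACTS, DIMENSIONS, VALENCES))

# Distinct dialogue texts needed once the shared neutral stimuli are merged.
unique_stimuli = {stimulus_key(*c) for c in conditions}

# Conditions covered by a single linguistic dimension.
per_dimension = [c for c in conditions if c[1] == "verbosity"]

print(len(conditions))      # 48 conditions
print(len(unique_stimuli))  # 36 distinct dialogues
print(len(per_dimension))   # 12 dialogues per dimension
```

Under this reading, 4 neutral dialogues plus 4 x 4 x 2 non-neutral ones give 36 distinct dialogue texts, and each linguistic dimension accounts for 12 dialogues.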

3.2 Measures 

In order to avoid a possible interference of scales,
we ran two versions of the study in parallel. In
one version, participants were asked to evaluate the
receptionist's utterances with respect to measures
of the Big Five personality traits (John and Srivastava,
1999), namely the traits of extraversion,
agreeableness, conscientiousness, emotional stability,
and openness, using the ten-item personality
questionnaire (TIPI, see (Gosling et al., 2003)). In the
other version, participants were asked to evaluate the
receptionist's utterances with respect to their
naturalness on a 7-point Likert scale by answering the
question "Do you agree that the receptionist's
utterances were natural?" Variants of such a
naturalness scale were used by Burkhardt et al. (2006)
and Mairesse and Walker (2008).
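
For readers unfamiliar with the TIPI, the sketch below shows how ten item ratings are conventionally reduced to five trait scores. It follows the standard scoring key published with the instrument (each trait is the mean of one regular and one reverse-scored item on a 1-7 scale); whether the study applied exactly this key to ratings of the receptionist's utterances is an assumption, and the function name `score_tipi` and the example ratings are invented for illustration.

```python
# Standard TIPI key: each trait averages two items, one of which is
# reverse-scored. Items are numbered 1-10 and rated on a 1-7 scale.
# Tuples are (regular item, reverse-scored item).
TIPI_KEY = {
    "extraversion": (1, 6),
    "agreeableness": (7, 2),
    "conscientiousness": (3, 8),
    "emotional stability": (9, 4),
    "openness": (5, 10),
}


def score_tipi(ratings):
    """Map ten 1-7 item ratings (dict: item number -> rating) to trait scores."""
    if set(ratings) != set(range(1, 11)):
        raise ValueError("expected ratings for items 1 through 10")
    if any(not 1 <= r <= 7 for r in ratings.values()):
        raise ValueError("ratings must lie on the 1-7 scale")
    scores = {}
    for trait, (regular, reverse) in TIPI_KEY.items():
        reversed_rating = 8 - ratings[reverse]  # maps 1<->7, 2<->6, 3<->5
        scores[trait] = (ratings[regular] + reversed_rating) / 2
    return scores


# Invented example ratings, not data from the study.
example = {1: 6, 2: 2, 3: 5, 4: 3, 5: 4, 6: 1, 7: 6, 8: 2, 9: 5, 10: 4}
print(score_tipi(example))
```

Each trait score stays on the same 1-7 scale as the naturalness rating, so 4 remains the neutral point for both measures.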



3.3 Experimental design 

The experiment used a crossed design with the fol- 
lowing factors: dimensions of linguistic variability 
(verbosity, hedging, alignment, or formality), va- 
lence (negative, neutral, or positive), dialogue acts 
(greeting, question-answer, disagreement, or apol- 
ogy), native language (American English or Arabic) 
and gender (male or female). 

In an attempt to balance the workload of the
participants, the experimental sessions consisted of
one linguistic variability condition (12 dialogues) for
participants assigned to the study that used the
personality scales, and of two conditions (24
dialogues) for those assigned to the study that used
the naturalness scale. Hence valence and dialogue act
were within-subject factors, while the linguistic
variability dimension was treated as a between-subject
factor, as were native language and gender. Within
each session the items were presented in a random
order to minimize possible carryover effects.
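
The session structure can be sketched as follows. This is an illustrative reconstruction under stated assumptions, not the study's software: the function `build_session` is hypothetical, and the uniform random choice of which dimension(s) a participant receives is an assumption, since the text specifies only the number of dimensions per session and the random presentation order.

```python
import random

DIALOGUE_ACTS = ["greeting", "question-answer", "disagreement", "apology"]
DIMENSIONS = ["verbosity", "hedging", "alignment", "formality"]
VALENCES = ["negative", "neutral", "positive"]

# Linguistic dimensions per session, as described in the text:
# one for the personality version, two for the naturalness version.
DIMENSIONS_PER_SESSION = {"personality": 1, "naturalness": 2}


def build_session(study_version, rng):
    """Return a shuffled list of (dimension, act, valence) items for one session."""
    n_dims = DIMENSIONS_PER_SESSION[study_version]
    # Assumption: dimensions are assigned uniformly at random.
    dims = rng.sample(DIMENSIONS, n_dims)
    items = [(d, a, v) for d in dims for a in DIALOGUE_ACTS for v in VALENCES]
    rng.shuffle(items)  # random presentation order within the session
    return items


session = build_session("naturalness", random.Random(42))
print(len(session))  # 24
```

Passing an explicit `random.Random` instance keeps each session's order reproducible from its seed, which is convenient when auditing which participant saw which order.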







Language          Country                  N
Arabic            Algeria                  1
                  Bahrain                  1
                  Egypt                   56
                  Jordan                  32
                  Morocco                 45
                  Palestinian Territory    1
                  Qatar                    1
                  Saudi Arabia             5
                  United Arab Emirates    13
                  Total                  155
American English  United States          166

Table 1: Distribution of study participants by country.

3.4 Participants

We used Amazon's Mechanical Turk (MTurk) to re- 
cruit native speakers of American English from the 
United States and native speakers of Arabic from 
any of the set of predominantly Arabic-speaking 
countries (according to the IP address). 

Upon completion of each task, participants received
a monetary reward as a credit to their MTurk account.
Special measures were taken to prevent multiple
participation of one person in the same study
condition: access to the study website was refused
for such a user based on the IP address, and
MTurk logs were checked for repeated MTurk user
names to detect logging into the same MTurk account
from different IP addresses. Hidden questions
were planted within the study to verify the fluency
in the participant's reported native language.
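
The log check described above amounts to grouping submissions by account and condition and flagging accounts seen from more than one IP address. A minimal sketch follows; the record layout `(worker_id, ip_address, condition)` and the function name `workers_with_multiple_ips` are assumptions made for illustration, not the format of actual MTurk logs.

```python
from collections import defaultdict


def workers_with_multiple_ips(log_records):
    """Return (worker_id, condition) pairs seen from more than one IP address.

    log_records: iterable of (worker_id, ip_address, condition) tuples
    (a hypothetical layout chosen for this sketch).
    """
    ips = defaultdict(set)
    for worker_id, ip_address, condition in log_records:
        ips[(worker_id, condition)].add(ip_address)
    return sorted(key for key, seen in ips.items() if len(seen) > 1)


# Invented example records.
records = [
    ("w1", "10.0.0.1", "verbosity"),
    ("w1", "10.0.0.2", "verbosity"),  # same account and condition, new IP
    ("w2", "10.0.0.3", "verbosity"),
    ("w2", "10.0.0.3", "hedging"),    # different condition: allowed
]
print(workers_with_multiple_ips(records))  # [('w1', 'verbosity')]
```

Keying on the condition as well as the account reflects that only repeated participation in the same study condition was disallowed.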

The distribution of the participants across countries
is shown in Table 1. We observed a regional
gender bias similar to the one reported by Ross et al.
(2010): there were 100 male and 55 female
participants in the Arabic condition, and 63 male and
103 female participants in the American English
condition.

4 Results 

We analyzed the data by fitting linear mixed-effects
(LME) models (Pinheiro and Bates, 2000) and
performing model selection using ANOVA. The
comparison of models fitted to explain the personality
and naturalness scores (controlling for language and 
gender), shows significant main effects of valence 
and dialogue acts for all pairs of personality traits 



(and naturalness) and linguistic features. The results 
also show that for every personality trait (and nat- 
uralness) there is a linguistic feature that results in 
a significant three-way interaction between its va- 
lence, the native language, and the dialogue act. 
These results suggest that (a) for both language com- 
munities, every linguistic dimension is associated 
with every personality trait and naturalness, for at 
least some of the dialogue acts, (b) there are differ- 
ences in the perception of every personality trait and 
naturalness between the two language communities. 
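
The exact model formula is not given in the text. One plausible specification for a single pair of measure and linguistic feature, assuming a random intercept per participant (an assumption, not something the text states), is sketched below. Here y_{pdv} is the score that participant p gives to the dialogue with dialogue act d and valence v, and l(p) and g(p) denote the participant's native language and gender; lower-order two-way interactions are omitted for brevity.

```latex
\[
y_{pdv} = \beta_0
        + \beta^{\mathrm{val}}_{v}
        + \beta^{\mathrm{act}}_{d}
        + \beta^{\mathrm{lang}}_{l(p)}
        + \beta^{\mathrm{gen}}_{g(p)}
        + \beta^{\mathrm{val}\times\mathrm{lang}\times\mathrm{act}}_{v\,l(p)\,d}
        + u_p + \varepsilon_{pdv},
\qquad
u_p \sim \mathcal{N}(0, \sigma_u^2), \quad
\varepsilon_{pdv} \sim \mathcal{N}(0, \sigma^2)
\]
```

Under this reading, the reported three-way interaction corresponds to a model comparison in which dropping the valence-by-language-by-act term significantly worsens the fit.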

To further explore the latter finding, we conducted
a post-hoc analysis consisting of paired t-tests that
were performed pairwise between the three values of
valence for each combination of language, linguistic
feature, and personality trait (and naturalness).
Note that comparing raw scores between the language
conditions would be prone to finding spurious
differences due to potential culture-specific
tendencies in scoring on the Likert scale: (a) perception
of magnitudes and (b) appropriateness of the
intensity of agreeing or disagreeing. Instead, we
compare the language conditions with respect to (a) the
relative order of the three valences and (b) the
binarized scores, namely whether the score is above 4
or below 4 (with scores that are not significantly
different from 4 excluded from comparison), where 4
is the neutral point of the 7-point Likert scale.
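
The two qualitative criteria can be made concrete with a short sketch. It is a simplification under stated assumptions: a score counts as "not significantly different from 4" when a supplied 95% confidence interval contains 4, and the order of valences is taken from raw means rather than from the paired t-tests used in the study. The function names and the example numbers are invented for illustration.

```python
NEUTRAL_POINT = 4  # midpoint of the 7-point Likert scale


def valence_order(mean_scores):
    """Return valences sorted from lowest to highest mean score."""
    return tuple(sorted(mean_scores, key=mean_scores.get))


def binarize(mean, ci_half_width):
    """Return 'above' or 'below' the neutral point, or None when the
    confidence interval around the mean contains the neutral point."""
    if mean - ci_half_width > NEUTRAL_POINT:
        return "above"
    if mean + ci_half_width < NEUTRAL_POINT:
        return "below"
    return None


def qualitative_differences(group_a, group_b):
    """Compare two language conditions for one dialogue act and measure.

    Each group maps valence -> (mean score, CI half-width). Returns the
    valences whose binarized scores disagree, and whether the relative
    order of the valences differs between the groups.
    """
    order_a = valence_order({v: m for v, (m, _) in group_a.items()})
    order_b = valence_order({v: m for v, (m, _) in group_b.items()})
    flipped = []
    for valence in group_a:
        side_a = binarize(*group_a[valence])
        side_b = binarize(*group_b[valence])
        # Scores indistinguishable from neutral are excluded from comparison.
        if side_a and side_b and side_a != side_b:
            flipped.append(valence)
    return flipped, order_a != order_b


# Invented numbers for illustration only, not results from the study.
group_1 = {"negative": (5.2, 0.4), "neutral": (5.0, 0.4), "positive": (3.1, 0.5)}
group_2 = {"negative": (4.6, 0.5), "neutral": (5.1, 0.4), "positive": (5.3, 0.4)}
print(qualitative_differences(group_1, group_2))
```

Because both criteria depend only on ordering and on the side of the neutral point, they are insensitive to a group-wide tendency to use more extreme or more moderate ratings.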

The selected results of the post-hoc analysis are
shown in Figure 1. The most prominent cross-cultural
differences were found in the scoring of
naturalness across the valences of the formality di- 
mension. Speakers of American English, unlike the 
speakers of Arabic, find formal utterances unnatu- 
ral in greetings, question-answer and disagreement 
dialogue acts. Formal utterances tend to also be 
perceived as indicators of openness and conscien- 
tiousness by Arabic speakers, and not by American 
English speakers, in disagreements and apologies 
respectively. Finally, hedging in apologies is per- 
ceived as an indicator of agreeableness by American 
English speakers, but not by speakers of Arabic. 

Interestingly, no qualitative differences across
language conditions were found in the perception
of extraversion and stability. It is possible that this
cross-cultural consistency confirms the view of
extraversion, in particular, as one of the most
consis[...] ([...] and Oberlander, 2002). It is also
possible that our stimuli were unable to pinpoint the
extraversion-related rich points due to the choice of
the linguistic dimensions or the particular wording
chosen. A larger variety of stimuli per condition, and
an ethnography to identify potentially culture-specific
linguistic devices of extraversion, could shed
light on this issue.

5 Conclusion 

We presented an exploratory study to evaluate a set 
of linguistic markers of Big Five personality traits 
and naturalness across two language communities: 
native speakers of American English living in the 
US, and native speakers of Arabic living in one 
of the predominantly Arabic-speaking countries of 
North Africa and the Middle East. The results suggest
that the four dimensions of linguistic variability are 
recognized as markers of all five personality traits by 
both language communities. A comparison across 
language communities uncovered some qualitative 
differences in the perception of openness, conscien- 
tiousness, agreeableness, and naturalness. 

The results of the study can be used to adapt nat- 
ural language generation and interpretation to native 
speakers of American English or Arabic. This ex- 
ploratory study also supports the feasibility of the 
crowdsourcing approach to validate the linguistic 
devices that realize rich points — behaviors that sig- 
nal differences across languages and cultures. 

Future work should evaluate the effects of regional
dialects and address the issue of particular wording
choices by using multiple stimuli per condition.

Dimension and valence: Verbosity negative

Greeting
A: Good morning.
B: Morning. May I help you?

Question-Answer
A: Could you tell me where the library is?
B: End of the hallway on your left.

Disagreement
A: Could you tell me where the library is?
B: Second floor.
A: I thought it was on the first floor.
B: No, it is not.

Apology
A: Could you tell me where the library is?
B: I don't know.

Dimension and valence: Verbosity positive

Greeting
A: Good morning.
B: Good morning. How are you today?
A: I am doing well, thanks. You?
B: Very well, thank you. How's your family?
A: Everyone is doing fine, thanks. How about yours?
B: Mine is doing well too. How may I help you?

Question-Answer
A: Could you tell me where the library is?
B: Yes, just follow this hallway until it ends and you
will find the library on your left hand side.

Disagreement
A: Could you tell me where the library is?
B: Yes, the library is on the second floor.
A: I thought it was on the first floor.
B: No, there is no library on the first floor of this
building.

Apology
A: Could you tell me where the library is?
B: I am so sorry, I don't really know where the library
is in this building.




[Figure 1 panels, one for American English and one for
Arabic in each row: formality and naturalness; formality
and openness; formality and conscientiousness; hedging
and agreeableness. Horizontal axis in every panel:
greeting, qa, disagree, apology.]

Figure 1: A subset of data comparing scores on the Big Five personality traits and naturalness as given by native
speakers of American English (left half of the page) and Arabic (right half of the page). Blue, white, and pink bars
correspond to negative, neutral, and positive valences of the linguistic features respectively. Dialogue acts listed along
the horizontal axis are a greeting, question-answer pair, disagreement, and apology. Error bars show the 95% confidence
intervals; brackets above the plots correspond to p-values of paired t-tests at significance levels of 0.05 (*) and 0.01
(**) after Bonferroni correction.



